Maximum Realisable Performance: a Principled Method for Enhancing Performance by Using Multiple Classifiers in Variable Cost Problem Domains
Abstract
A novel method is described for obtaining superior classification performance over a variable range of classification costs. By analysis of a set of existing classifiers using a receiver operating characteristic (ROC) curve, a set of new realisable classifiers may be obtained by a principled random combination of two of the existing classifiers. These classifiers lie on the convex hull that contains the original ROC points for the existing classifiers. This hull is the maximum realisable ROC (MRROC). A theorem for this method is derived and proved from an observation about ROC data, and experimental results verify that a superior classification system may be constructed using only the existing classifiers and the information of the original ROC data. This new system is shown to produce the MRROC, and as such provides a powerful technique for improving classification systems in problem domains within which classification costs may not be known a priori [Lovell et al., 1997b; Lovell et al., 1997a].
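The construction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`roc_convex_hull`, `mixing_probability`, `expected_roc`, `make_random_combination`) are hypothetical, ROC points are assumed to be `(false positive rate, true positive rate)` pairs, and the mixing probability uses the usual linear interpolation between two hull vertices, which the abstract does not spell out.

```python
import random


def _cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def roc_convex_hull(points):
    """Upper convex hull of ROC points, with the trivial classifiers
    (0, 0) and (1, 1) added. Returned vertices are sorted by FPR."""
    pts = sorted(set(map(tuple, points)) | {(0.0, 0.0), (1.0, 1.0)})
    hull = []
    for p in pts:
        # Drop the previous vertex while it lies on or below the
        # segment joining its predecessor to the new point.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


def mixing_probability(point_a, point_b, target_fpr):
    """Probability of using classifier B so that the expected FPR of
    the random combination equals target_fpr (assumed interpolation)."""
    fa, fb = point_a[0], point_b[0]
    if fa == fb or not fa <= target_fpr <= fb:
        raise ValueError("target_fpr must lie between two distinct FPRs")
    return (target_fpr - fa) / (fb - fa)


def expected_roc(point_a, point_b, alpha):
    """Expected ROC point when B is used with probability alpha."""
    return (
        (1 - alpha) * point_a[0] + alpha * point_b[0],
        (1 - alpha) * point_a[1] + alpha * point_b[1],
    )


def make_random_combination(clf_a, clf_b, alpha, rng=None):
    """Return a classifier that defers to clf_b with probability alpha
    and to clf_a otherwise, decided independently for each input."""
    rng = rng or random.Random()

    def combined(x):
        return clf_b(x) if rng.random() < alpha else clf_a(x)

    return combined


if __name__ == "__main__":
    # Illustrative ROC points only, not data from the paper.
    existing = [(0.1, 0.6), (0.3, 0.65), (0.4, 0.9)]
    hull = roc_convex_hull(existing)
    print(hull)  # (0.3, 0.65) falls below the hull and is dropped
    alpha = mixing_probability(hull[1], hull[2], 0.25)
    print(expected_roc(hull[1], hull[2], alpha))
```

In this sketch a classifier whose ROC point falls below the hull is never needed: some random combination of its two neighbouring hull vertices is expected to match or exceed its true positive rate at the same false positive rate.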
Similar resources
Realisable Classifiers: Improving Operating Performance on Variable Cost Problems
A novel method is described for obtaining superior classification performance over a variable range of classification costs. By analysis of a set of existing classifiers using a receiver operating characteristic (ROC) curve, a set of new realisable classifiers may be obtained by a random combination of two of the existing classifiers. These classifiers lie on the convex hull that contains the o...
Improved Hoeffding-style Performance Guarantees for Accurate Classifiers
We extend Hoeffding bounds to develop superior probabilistic performance guarantees for accurate classifiers. The original Hoeffding bounds on classifier accuracy depend on the accuracy itself as a parameter. Since the accuracy is not known a priori, the parameter value that gives the weakest bounds is used. We present a method that loosely bounds the accuracy using the old method and uses the loos...
Effects of Far- and Near-Field Multiple Earthquakes on the RC SDOF Fragility Curves Using Different First Shock Scaling Methods
Typically, to study the effects of consecutive earthquakes, it is necessary to consider definite intensity levels of the first shock. Methods commonly used to define intensity involve scaling the first shock to a specified maximum interstorey drift. In this study the structure’s predefined elastic spectral acceleration caused by the first shock is also considered for scaling. This study aims to...
Enhancing Learning from Imbalanced Classes via Data Preprocessing: A Data-Driven Application in Metabolomics Data Mining
This paper presents a data mining application in metabolomics. It aims at building an enhanced machine learning classifier that can be used for diagnosing cachexia syndrome and identifying its involved biomarkers. To achieve this goal, a data-driven analysis is carried out using a public dataset consisting of 1H-NMR metabolite profile. This dataset suffers from the problem of imbalanced classes...
Deriving biased classifiers for better ROC performance
ROC analysis makes it possible to evaluate how well classifiers will perform given certain misclassification costs and class distributions. Given a set of classifiers, it also provides a method for constructing a hybrid classifier that optimally uses the available classifiers. Now in some cases it is possible to derive multiple classifiers from a single one, in a cheap way, and such that these class...